phyloseq: An R Package for Reproducible Interactive Analysis and Graphics of Microbiome Census Data

نویسندگان

  • Paul J. McMurdie
  • Susan Holmes
چکیده

BACKGROUND the analysis of microbial communities through dna sequencing brings many challenges: the integration of different types of data with methods from ecology, genetics, phylogenetics, multivariate statistics, visualization and testing. With the increased breadth of experimental designs now being pursued, project-specific statistical analyses are often needed, and these analyses are often difficult (or impossible) for peer researchers to independently reproduce. The vast majority of the requisite tools for performing these analyses reproducibly are already implemented in R and its extensions (packages), but with limited support for high throughput microbiome census data. RESULTS Here we describe a software project, phyloseq, dedicated to the object-oriented representation and analysis of microbiome census data in R. It supports importing data from a variety of common formats, as well as many analysis techniques. These include calibration, filtering, subsetting, agglomeration, multi-table comparisons, diversity analysis, parallelized Fast UniFrac, ordination methods, and production of publication-quality graphics; all in a manner that is easy to document, share, and modify. We show how to apply functions from other R packages to phyloseq-represented data, illustrating the availability of a large number of open source analysis techniques. We discuss the use of phyloseq with tools for reproducible research, a practice common in other fields but still rare in the analysis of highly parallel microbiome census data. We have made available all of the materials necessary to completely reproduce the analysis and figures included in this article, an example of best practices for reproducible research. CONCLUSIONS The phyloseq project for R is a new open-source software package, freely available on the web from both GitHub and Bioconductor.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Shiny-phyloseq: Web application for interactive microbiome analysis with provenance tracking

UNLABELLED We have created a Shiny-based Web application, called Shiny-phyloseq, for dynamic interaction with microbiome data that runs on any modern Web browser and requires no programming, increasing the accessibility and decreasing the entrance requirement to using phyloseq and related R tools. Along with a data- and context-aware dynamic interface for exploring the effects of parameter and ...

متن کامل

Reproducible Research Workflow in R for the Analysis of Personalized Human Microbiome Data

This article presents a reproducible research workflow for amplicon-based microbiome studies in personalized medicine created using Bioconductor packages and the knitr markdown interface.We show that sometimes a multiplicity of choices and lack of consistent documentation at each stage of the sequential processing pipeline used for the analysis of microbiome data can lead to spurious results. W...

متن کامل

Advancing our understanding of the human microbiome using QIIME.

High-throughput DNA sequencing technologies, coupled with advanced bioinformatics tools, have enabled rapid advances in microbial ecology and our understanding of the human microbiome. QIIME (Quantitative Insights Into Microbial Ecology) is an open-source bioinformatics software package designed for microbial community analysis based on DNA sequence data, which provides a single analysis framew...

متن کامل

phyloseq: A Bioconductor Package for Handling and Analysis of High-Throughput Phylogenetic Sequence Data

We present a detailed description of a new Bioconductor package, phyloseq, for integrated data and analysis of taxonomically-clustered phylogenetic sequencing data in conjunction with related data types. The phyloseq package integrates abundance data, phylogenetic information and covariates so that exploratory transformations, plots, and confirmatory testing and diagnostic plots can be carried ...

متن کامل

Waste Not, Want Not: Why Rarefying Microbiome Data Is Inadmissible

Current practice in the normalization of microbiome count data is inefficient in the statistical sense. For apparently historical reasons, the common approach is either to use simple proportions (which does not address heteroscedasticity) or to use rarefying of counts, even though both of these approaches are inappropriate for detection of differentially abundant species. Well-established stati...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره 8  شماره 

صفحات  -

تاریخ انتشار 2013